1,248 research outputs found

    Feature-based time-series analysis

    This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world. Comment: 28 pages, 9 figures.
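The core idea of a feature-based representation, as described above, can be sketched in a few lines: each time series is mapped to a small vector of interpretable summary statistics. The particular features below (mean, standard deviation, lag-1 autocorrelation) are illustrative choices, not the specific feature set surveyed in the paper.

```python
# Minimal sketch of a feature-based time-series representation.
# Feature choices here are illustrative placeholders.
import numpy as np

def feature_vector(x):
    """Map a time series to a few interpretable summary features."""
    x = np.asarray(x, dtype=float)
    return {
        "mean": x.mean(),
        "std": x.std(),
        # lag-1 autocorrelation: a simple measure of temporal structure
        "acf1": np.corrcoef(x[:-1], x[1:])[0, 1],
    }

rng = np.random.default_rng(0)
noise = rng.normal(size=500)   # white noise: no temporal structure
trend = noise.cumsum()         # random walk: strong temporal structure
print(feature_vector(noise)["acf1"])  # near 0
print(feature_vector(trend)["acf1"])  # near 1
```

Two series that look superficially similar can then be distinguished by where they land in this low-dimensional feature space rather than by pointwise comparison.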

    Highly comparative feature-based time-series classification

    A highly comparative, feature-based approach to time series classification is introduced that uses an extensive database of algorithms to extract thousands of interpretable features from time series. These features are derived from across the scientific time-series analysis literature, and include summaries of time series in terms of their correlation structure, distribution, entropy, stationarity, scaling properties, and fits to a range of time-series models. After computing thousands of features for each time series in a training set, those that are most informative of the class structure are selected using greedy forward feature selection with a linear classifier. The resulting feature-based classifiers automatically learn the differences between classes using a reduced number of time-series properties, and circumvent the need to calculate distances between time series. Representing time series in this way results in orders of magnitude of dimensionality reduction, allowing the method to perform well on very large datasets containing long time series or time series of different lengths. For many of the datasets studied, classification performance exceeded that of conventional instance-based classifiers, including one-nearest-neighbor classifiers using Euclidean distances and dynamic time warping, and, most importantly, the features selected provide an understanding of the properties of the dataset, insight that can guide further scientific investigation.
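Greedy forward feature selection with a linear classifier, as the abstract describes, can be sketched without dependencies. The nearest-centroid linear rule and in-sample scoring below are deliberate simplifications (the paper selects from thousands of features using cross-validated linear classification); the toy feature matrix is a hypothetical stand-in.

```python
# Hedged sketch of greedy forward feature selection with a linear classifier.
import numpy as np

def linear_score(X, y):
    """In-sample accuracy of a simple nearest-class-centroid linear rule."""
    mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    w = mu1 - mu0                      # direction separating class centroids
    threshold = (mu0 + mu1) @ w / 2    # midpoint between projected centroids
    return ((X @ w > threshold).astype(int) == y).mean()

def greedy_forward_select(X, y, n_features=2):
    """At each step, add the feature that most improves the classifier."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_features):
        best = max(remaining,
                   key=lambda j: linear_score(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy feature matrix: only column 0 carries class information.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=120)
X = rng.normal(size=(120, 10))
X[:, 0] += 2.0 * y
print(greedy_forward_select(X, y))  # column 0 should be selected first
```

The output of selection is itself interpretable: the identity of the chosen features tells you which time-series properties distinguish the classes.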

    Never a Dull Moment: Distributional Properties as a Baseline for Time-Series Classification

    The variety of complex algorithmic approaches for tackling time-series classification problems has grown considerably over the past decades, including the development of sophisticated but challenging-to-interpret deep-learning-based methods. But without comparison to simpler methods it can be difficult to determine when such complexity is required to obtain strong performance on a given problem. Here we evaluate the performance of an extremely simple classification approach -- a linear classifier in the space of two simple features that ignore the sequential ordering of the data: the mean and standard deviation of time-series values. Across a large repository of 128 univariate time-series classification problems, this simple distributional moment-based approach outperformed chance on 69 problems, and reached 100% accuracy on two problems. With a neuroimaging time-series case study, we find that a simple linear model based on the mean and standard deviation performs better at classifying individuals with schizophrenia than a model that additionally includes features of the time-series dynamics. Comparing against the performance of simple distributional features of a time series provides important context for interpreting the performance of complex time-series classification models, which may not always be required to obtain high accuracy. Comment: 8 pages, 3 figures.
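The baseline described here is simple enough to sketch end-to-end: represent each series by its mean and standard deviation, then separate classes with a linear rule in that 2D space. The synthetic data and nearest-centroid rule below are illustrative assumptions, not the paper's datasets or exact classifier.

```python
# Sketch of a two-feature distributional baseline, assuming toy data.
import numpy as np

def moments(x):
    """Ignore temporal ordering entirely: just mean and std."""
    x = np.asarray(x, dtype=float)
    return np.array([x.mean(), x.std()])

rng = np.random.default_rng(2)
class_a = [rng.normal(0.0, 1.0, 200) for _ in range(30)]  # low variance
class_b = [rng.normal(0.0, 3.0, 200) for _ in range(30)]  # high variance
X = np.array([moments(s) for s in class_a + class_b])
y = np.array([0] * 30 + [1] * 30)

# Nearest-centroid linear classifier in the 2D (mean, std) space.
mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
w = mu1 - mu0
pred = (X @ w > (mu0 + mu1) @ w / 2).astype(int)
print((pred == y).mean())  # in-sample accuracy; near 1.0 on this toy data
```

If a problem is solved this well by two order-free features, as the abstract argues, the sequential structure of the data may be contributing little to a more complex model's accuracy.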

    Tracking the distance to criticality in systems with unknown noise

    Many real-world systems undergo abrupt changes in dynamics as they move across critical points, often with dramatic and irreversible consequences. Much of the existing theory on identifying the time-series signatures of nearby critical points -- such as increased signal variance and slower timescales -- is derived from analytically tractable systems, typically considering the case of fixed, low-amplitude noise. However, real-world systems are often corrupted by unknown levels of noise which can obscure these temporal signatures. Here we aimed to develop noise-robust indicators of the distance to criticality (DTC) for systems affected by dynamical noise in two cases: when the noise amplitude is either fixed, or is unknown and variable across recordings. We present a highly comparative approach to tackling this problem that compares the ability of over 7000 candidate time-series features to track the DTC in the vicinity of a supercritical Hopf bifurcation. Our method recapitulates existing theory in the fixed-noise case, highlighting conventional time-series features that accurately track the DTC. But in the variable-noise setting, where these conventional indicators perform poorly, we highlight new types of high-performing time-series features and show that their success is underpinned by an ability to capture the shape of the invariant density (which depends on both the DTC and the noise amplitude) relative to the spread of fast fluctuations (which depends on the noise amplitude). We introduce a new high-performing time-series statistic, termed the Rescaled Auto-Density (RAD), which distils these two algorithmic components. Our results demonstrate that large-scale algorithmic comparison can yield theoretical insights and motivate new algorithms for solving important practical problems. Comment: The main paper comprises 18 pages, with 5 figures (.pdf). The supplemental material comprises a single 4-page document with 1 figure (.pdf), as well as 3 spreadsheet files (.xls).
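The RAD statistic itself is defined in the paper; as a hedged illustration of its two ingredients only, the toy ratio below compares the spread of the value distribution (standing in for the shape of the invariant density) to the spread of successive differences (the fast fluctuations). This is not the authors' definition, just a sketch of why such a ratio can factor out an unknown noise amplitude.

```python
# Illustrative only: NOT the paper's RAD definition, just its two ingredients.
import numpy as np

def density_vs_fluctuation_ratio(x):
    x = np.asarray(x, dtype=float)
    spread = np.std(x)          # scale of the overall value distribution
    fast = np.std(np.diff(x))   # amplitude of fast, step-to-step fluctuations
    return spread / fast

rng = np.random.default_rng(3)
white = rng.normal(size=1000)   # dominated by fast fluctuations: small ratio
slow = np.convolve(rng.normal(size=1100), np.ones(100) / 100, "valid")
print(density_vs_fluctuation_ratio(white))  # below 1 for white noise
print(density_vs_fluctuation_ratio(slow))   # much larger for slow dynamics
```

Because both numerator and denominator scale with the driving noise amplitude, a ratio of this form is less sensitive to that amplitude than either quantity alone, which is the intuition the abstract attributes to the high-performing features.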

    Highly comparative time-series analysis: The empirical structure of time series and their methods

    The process of collecting and organizing sets of observations represents a common theme throughout the history of science. However, despite the ubiquity of scientists measuring, recording, and analyzing the dynamics of different processes, an extensive organization of scientific time-series data and analysis methods has never been performed. Addressing this, annotated collections of over 35 000 real-world and model-generated time series and over 9000 time-series analysis algorithms are analyzed in this work. We introduce reduced representations of both time series, in terms of their properties measured by diverse scientific methods, and of time-series analysis methods, in terms of their behaviour on empirical time series, and use them to organize these interdisciplinary resources. This new approach to comparing across diverse scientific data and methods allows us to organize time-series datasets automatically according to their properties, retrieve alternatives to particular analysis methods developed in other scientific disciplines, and automate the selection of useful methods for time-series classification and regression tasks. The broad scientific utility of these tools is demonstrated on datasets of electroencephalograms, self-affine time series, heartbeat intervals, speech signals, and others, in each case contributing novel analysis techniques to the existing literature. Highly comparative techniques that compare across an interdisciplinary literature can thus be used to guide more focused research in time-series analysis for applications across the scientific disciplines.
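The organizing idea here, representing each analysis method by its vector of outputs across many time series and then comparing methods by their empirical behaviour, can be sketched at toy scale. The three "methods" below are trivial stand-ins for the thousands of algorithms in the paper.

```python
# Hedged sketch: organize methods by their behaviour on a set of time series.
import numpy as np

rng = np.random.default_rng(4)
# A mixed corpus: alternating white-noise and random-walk series.
series = [rng.normal(size=300).cumsum() if i % 2 else rng.normal(size=300)
          for i in range(40)]

methods = {
    "std": lambda x: np.std(x),
    "range": lambda x: np.ptp(x),
    "acf1": lambda x: np.corrcoef(x[:-1], x[1:])[0, 1],
}
# Rows: methods; columns: time series. Each method becomes a point in
# an empirical "behaviour space".
M = np.array([[f(x) for x in series] for f in methods.values()])
sim = np.corrcoef(M)  # method-method similarity from behaviour alone
print(sim[0, 1])  # std and range behave very similarly on this corpus
```

High behavioural similarity between two methods, discovered purely from data, is what lets the framework retrieve one method as an alternative to another, even across disciplinary boundaries.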

    Finding binaries from phase modulation of pulsating stars with Kepler: VI. Orbits for 10 new binaries with mischaracterised primaries

    Measuring phase modulation in pulsating stars has proved to be a highly successful way of finding binary systems. The class of pulsating main-sequence A and F variables known as delta Scuti stars is particularly good targets for this, and the Kepler sample of these has been almost fully exploited. However, some Kepler δ Scuti stars have incorrect temperatures in stellar properties catalogues, and were missed in previous analyses. We used an automated pulsation classification algorithm to find 93 new δ Scuti pulsators among tens of thousands of F-type stars, which we then searched for phase modulation attributable to binarity. We discovered 10 new binary systems and calculated their orbital parameters, which we compared with those of binaries previously discovered in the same way. The results suggest that some of the new companions may be white dwarfs. Comment: 8 pages, 6 figures that make liberal use of colour.

    On the information-theoretic formulation of network participation

    The participation coefficient is a widely used metric of the diversity of a node's connections with respect to a modular partition of a network. An information-theoretic formulation of this concept of connection diversity, referred to here as participation entropy, has been introduced as the Shannon entropy of the distribution of module labels across a node's connected neighbors. While diversity metrics have been studied theoretically in other literatures, including to index species diversity in ecology, many of these results have not previously been applied to networks. Here we show that the participation coefficient is a first-order approximation to participation entropy and use the desirable additive properties of entropy to develop new metrics of connection diversity with respect to multiple labelings of nodes in a network, as joint and conditional participation entropies. The information-theoretic formalism developed here allows new and more subtle types of nodal connection patterns in complex networks to be studied.
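The two quantities the abstract relates can be made concrete for a single node. Given the proportions p_m of a node's neighbors in each module m, the standard participation coefficient is 1 − Σ p_m², while participation entropy is the Shannon entropy −Σ p_m log p_m of the same distribution. A minimal sketch:

```python
# Participation coefficient vs participation entropy for one node,
# given module proportions p over the node's neighbors.
import numpy as np

def participation_coefficient(p):
    """Standard definition: one minus the sum of squared proportions."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)

def participation_entropy(p):
    """Shannon entropy (in nats) of the module-label distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return -np.sum(p * np.log(p))

# A node with 10 neighbors spread over three modules as 5/3/2.
p = np.array([5, 3, 2]) / 10
print(participation_coefficient(p))  # 0.62
print(participation_entropy(p))      # ≈ 1.03 nats
```

Both vanish when all neighbors share one module and grow with diversity; 1 − Σ p² is the Gini–Simpson index, which is why it behaves as a first-order approximation to the entropy.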